Executive Summary

In this project, we addressed issues in coordinated scheduling for dynamic real-time systems. In
particular, we concentrated on the design and implementation of a new distributed real-time system
called R-Shell. The design objective of R-Shell is to provide computing support for space programs that
have large, complex, fault-tolerant distributed real-time applications. The R-Shell approach is based on
the concept of scheduling agents, which reside in the application run-time environment and are
customized to provide just those resource management functions that are needed by the specific
application. With this approach, we avoid the need for a sophisticated OS that provides a variety of
generalized functionality, while still not burdening application programmers with heavy responsibility
for resource management. In this report, we discuss the R-Shell approach, summarize the achievements of
the project, and describe a preliminary prototype of the R-Shell system.

1 Overview of the Approach 

Real-time systems are generally designed and implemented using an entirely static approach. The system 
designers identify all of the functions which the system needs to perform, and create a set of tasks to 
perform these functions. The system designers design and implement these tasks, and then use several 
test runs to determine the worst-case timing requirements of each task. Then they create a statically 
predetermined schedule, using algorithms such as the cyclic executive, which ensures that every task will 
meet its deadline. The system is then tested exhaustively to minimize the possibility of timing faults 
during its operation. 

While this approach has been widely used in the past, there are inherent problems in trying to 
apply it to the increasingly complex systems of today, such as the Data Management System of Space 
Station Freedom. Exhaustive testing is extremely expensive, and even then only a limited amount of 
confidence is obtained. Predicting the timing requirements of tasks becomes increasingly difficult when
the tasks become more complex, and when there is increased interaction among tasks in the form of
inter-task communication and synchronization requirements.

Furthermore, applications such as space systems have another major requirement: that of handling
unexpected dynamic situations. There are many sources of dynamic behavior, including emergency
situations, mode changes, variations in workload, and faults. In space and other life-critical
applications, the handling of emergencies and faults is crucial. Conventionally, the OS is a
configuration-dependent entity, which coordinates the allocation of resources and handles situations
such as faults using general techniques, such as process migration, message rerouting, and replication
of remote procedure calls. These techniques are independent of application semantics but may depend on
resource characteristics. The application handles resource-related situations using techniques which
may exploit application semantics but are often independent of the system configuration, such as fault
recovery procedures, handling of memory and file allocation errors, and version selection for imprecise
computation. Thus, this static functional separation of the OS and the application makes efficient
handling of emergencies and faults difficult.

Our R-Shell approach represents an integration of the functionality of real-time applications and of
the OS, with respect to resource management. This integration is accomplished by the use of scheduling
agents. Scheduling agents reside in the application run-time environment, and are customized to provide
just those resource management functions which are needed by the specific application. The scheduling
agent implementation is customized to the particular OS and system configuration, thus exploiting
OS-level knowledge. With this approach, we avoid the need for a sophisticated OS which provides a
variety of generalized functionality, while still not burdening application programmers with heavy
responsibility for resource management. Thus, instead of locking in the roles of the OS and the
application, scheduling agents allow application designers to select the kind of behavior they want.

The focal point of our R-Shell project is to address the following issues which are critical in distributed 
real-time systems: 

• Flexible scheduling strategy: With our R-Shell approach, the scheduling policies of the system 
can be modified easily by changing the scheduling agent functionality. For example, different 
programming languages can provide different scheduling agents to reflect their design philosophy.
It is also easier to utilize application semantics to make more intelligent scheduling decisions. For 
example, imprecise computation techniques are easily embedded in the scheduling agent. 

• Fully distributed scheduling: A centralized scheduler becomes a single point of failure for
systems. In our R-Shell system, the scheduling is distributed to all individual nodes and applications.
If an application is itself distributed, each separate component has its own scheduling agent and
treats its need for data from other components as resource needs. This approach also makes it
much easier and more cost-effective for applications to adapt and migrate between different
execution platforms.

• Use of object-oriented model: The R-Shell approach is truly object-oriented, in that each 
application has its own scheduling agent, and thus makes its own scheduling decisions. If the OS 
acts as a centralized scheduler, the resource correctness of each application object would depend on the
OS and on the resource needs of other applications, which is not in keeping with the object-oriented 
philosophy that each object should be a self-contained entity that can be designed, implemented 
and verified independently. 

With these features, the R-Shell approach can address the problems crucial to space programs, such
as emergencies, mode changes, variations in workload, and fault tolerance.

With the support under this grant from NASA Ames, we have designed and implemented a prototype
of R-Shell. The purpose of prototyping is to test the feasibility of R-Shell concepts and to provide
information for a full-scale design and implementation planned in the near future. The current prototype
has been implemented on a UNIX-based system. We would like to stress, however, that the principles
reflected and the lessons learned in the prototyping are applicable to other environments as well. In
the rest of the report, we summarize design and implementation issues in the prototyping. We concentrate
on scheduling agents and resource managers because they are the key components in R-Shell.


2 Scheduling Agents 

Scheduling agents interface between the application and operating system. They are constructed to fit 
the needs of particular applications. The operating system capabilities they utilize and the functionality 
which they provide to the application can both be determined by application designers based on the
implementation platform and application requirements. However, the scheduling agents are not a part 
of the application. They are part of the run-time support system provided by the software development 
environment. 

Scheduling agents use the technique of multiple version selection in order to implement imprecise
computation to deal with dynamic situations. If a particular resource set is not available for an
application, then an alternate version is chosen for execution. Imprecise computation enables
applications to produce approximate results when the time or other resources available are insufficient
for producing the original desired result [12, 14]. Using imprecise computation, we can design
applications which provide predictable performance degradation.

2.1 The Approach 



Figure 1: R-Shell Compilation Process 

In the prototype, an application is an arbitrary C program that is logically correct. A scheduling agent
is realized by inserting code into the source code of an application program; this code calls routines
in the R-Shell run-time libraries. The insertion is done by a translator which reads application
requirements from a file and then inserts the code for the scheduling agent into the application. The
requirements file may be generated by the programmer or by an application analyzer. The requirements
file includes programmer directives to allow the programmer full control of the scheduling agent. The
modified application program with an embedded scheduling agent is then compiled with the R-Shell
libraries to produce a real-time application. See Figure 1 for a diagram of the compilation process.

The resulting real-time application can be viewed as in Figure 2. Scheduling agents interface directly
with applications as code that is inserted into each application and then compiled with the R-Shell
libraries. Scheduling agents then communicate with the R-Shell libraries via procedure calls, as
described in Section 2.3.

Figure 2: Sample Real-Time Application


2.2 Requirements Files 

The requirements file provides several pieces of information to the scheduling agent that aid in
scheduling the application and its procedures. Resource requirements and programmer directives have
been incorporated into the requirements file. Resources include CPU time, memory and network bandwidth.
The programmer can also specify the quality of a function's result, the relative priority of a function
and the deadline for each procedure. An alternate version of a procedure and an exception handler can
be specified by the programmer to implement either multiple version selection or resource exception
handling. Table 1 describes each field in the requirements file.

Table 1: Requirements File Specification 


Field Name          Description
------------------  ------------------------------------------------------------
CPU                 Amount of CPU time (in seconds) required
Memory              Amount of memory (in kilobytes) required
Network             Amount of network bandwidth (in kilobytes) required
Quality             Percent quality of the function's output
Priority            Application level priority based on a relative scale
Deadline            Application level deadline; the units are in seconds
Alternate Version   Alternate function to call if a resource request
                    guarantee cannot be obtained
Exception Handler   Exception handler to call when a resource exception occurs


The requirements file was constructed at the procedure level. Application scheduling is achieved by
placing resource requirements on the program's main() function.
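
One way to picture a single requirements-file row in memory is a C record with one member per field of
Table 1. The following is only a sketch: the type name rs_requirement, its member names, the lookup
helper rs_find, and the choice to keep the memory requirement as an unevaluated expression string are
all assumptions made for illustration, not the prototype's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one requirements-file row (see Table 1). */
typedef struct {
    const char *function;     /* procedure the row applies to */
    double      cpu_seconds;  /* CPU time required */
    const char *memory_expr;  /* memory in kilobytes, kept as a C-style expression */
    double      network_kb;   /* network bandwidth required; 0 if unspecified */
    int         quality_pct;  /* percent quality of the function's output */
    int         priority;     /* relative application-level priority */
    double      deadline_s;   /* application-level deadline in seconds */
    const char *alternate;    /* alternate version, or NULL if none */
    const char *handler;      /* exception handler, or NULL for the default */
} rs_requirement;

/* Look up the row for a procedure name; returns NULL if it has no requirements. */
static const rs_requirement *rs_find(const rs_requirement *rows, size_t n,
                                     const char *name)
{
    assert(rows != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(rows[i].function, name) == 0)
            return &rows[i];
    }
    return NULL;
}
```

A translator could call a lookup of this kind each time it detects a procedure declaration, and generate
scheduling-agent code only when a row is found.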

A sample requirements file is shown in Figure 3. Consider the row that specifies function main. It
indicates that the function needs 40 units of time and 100 kilobytes of memory to execute. The quality
of the result produced by main will be 100%, i.e., not an approximation. The relative priority is 8 and
the deadline is 200 units of time. If main fails to obtain a resource guarantee, the scheduling agent
will attempt to schedule the alternate version, alt_main. If either of these functions generates a
resource exception, then the user-defined exception handler recover() is called. The grid_resolve and
half_matrix functions are both aperiodic. The alternate version for grid_resolve is half_matrix(m/2).
Note the parameter m/2 in the function call. This allows the alternate version to work with a different
set of parameters than the primary version. The memory requirements for these two functions are 32*m.
Thus, the programmer is allowed to specify parametric (dynamic) resource requirements as C-style
expressions.

#
# Function                                              Alternate           Exception
# Name           CPU    MEM     NET   QUAL   PRI   DED    Version             Handler
#
main           ; 40   ; 100   ; -   ; 100  ; 8   ; 200  ; alt_main          ; recover()
alt_main       ; 25   ; 160   ;     ; 80   ; 5   ; 200  ;                   ; recover()
grid_resolve   ; 100  ; 32*m  ;     ; 100  ; 2   ; 40   ; half_matrix(m/2)  ;
half_matrix    ; 74   ; 32*m  ; -   ; 50   ; 3   ; 40   ;                   ; resource_EH()

Figure 3: Sample Requirements File

The exception handler field is used to specify a routine to be called when a resource exception occurs.
The default exception handler (proc_abort()) aborts the currently executing procedure.

2.3 R-Shell API

The application programmer interface (API) is a set of routines that R-Shell provides for scheduling
agents. These routines are listed in Table 2.


Table 2: API for R-Shell Applications 


Function Name       Description
------------------  ------------------------------------------------------------
initialize_SA       Called from main() to initialize interprocess
                    communication with Resource Manager
resource_request    Called from each procedure with a scheduling agent
                    to request resource guarantees
save_environment    Used by scheduling agent for procedure level scheduling
                    to save currently executing environment
request_exception   General purpose exception handler called when
                    resource guarantee cannot be obtained
notify_RM           Used to notify Resource Manager of procedure completion
abort_procedure     Used to abort the currently executing procedure
proc_abort          Default exception handler called when a
                    resource exception occurs


3 Resource Managers 

3.1 The Approach 

In the prototype, there is a general resource manager which acts as an interface between scheduling
agents and individual resource managers. Individual resources include CPU time, memory and network
bandwidth. Resource managers use cooperative resource management in order to provide resource
guarantees.

| Message Type | Process ID | CPU Time | Start Time | Deadline | Memory | Memory Used | Bandwidth | Priority | Level |

Message Type : 0 -> Resource request for an aperiodic task.
               1 -> Resource request for a periodic task.
               2 -> Application/procedure termination notification; all other fields empty.
Process ID   : Helps the resource managers associate resource requests with processes;
               also used for allocating and controlling resources.
CPU Time     : CPU time required (in seconds).
Start Time   : Relative start time for the application/procedure (in seconds).
Deadline     : Relative deadline for the application/procedure (in seconds).
Memory       : Memory required (in kbytes).
Memory Used  : Total memory already being used by the process; tracked by the scheduling agent.
Bandwidth    : Network bandwidth required (in kbytes).
Priority     : Application's priority.
Level        : Level of the currently executing procedure within the application.

Figure 4: Resource Request Message Format
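
The resource request message of Figure 4 maps naturally onto a C structure. The sketch below is an
assumption-laden illustration: the names rs_msg_type, rs_request, rs_make_termination and rs_is_request,
the field widths, and the decision to keep the process ID in a termination notification are not taken
from the prototype.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Message type codes as listed in Figure 4. */
enum rs_msg_type {
    RS_REQ_APERIODIC = 0,  /* resource request for an aperiodic task */
    RS_REQ_PERIODIC  = 1,  /* resource request for a periodic task */
    RS_TERMINATION   = 2   /* application/procedure termination notification */
};

/* Hypothetical layout of a resource request; field widths are assumptions. */
struct rs_request {
    int32_t type;            /* one of enum rs_msg_type */
    int32_t pid;             /* process ID of the requester */
    double  cpu_time;        /* CPU time required, seconds */
    double  start_time;      /* relative start time, seconds */
    double  deadline;        /* relative deadline, seconds */
    int32_t memory_kb;       /* memory required, kbytes */
    int32_t memory_used_kb;  /* memory already used by the process, kbytes */
    int32_t bandwidth_kb;    /* network bandwidth required, kbytes */
    int32_t priority;        /* application's priority */
    int32_t level;           /* level of the currently executing procedure */
};

/* Build a termination notification: type and pid are set, the rest stay zero. */
static struct rs_request rs_make_termination(int32_t pid)
{
    struct rs_request m;
    memset(&m, 0, sizeof m);
    m.type = RS_TERMINATION;
    m.pid  = pid;
    return m;
}

/* True for the two request types, false for a termination notification. */
static int rs_is_request(const struct rs_request *m)
{
    assert(m != NULL);
    return m->type == RS_REQ_APERIODIC || m->type == RS_REQ_PERIODIC;
}
```

A real implementation that sends this structure between machines would also have to fix byte order and
padding, which the sketch ignores.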

Scheduling agents interact with the general resource manager, which then forwards the resource
requests to individual resource managers to obtain guarantees of application resource requirements.
Resource managers can also provide information about resource availability, and accept messages from
applications specifying information about resource usage, such as preferences for certain resources.

The approach to the dynamic scheduling problem has been that of scheduling at task arrival time. 
As each task arrives, the system attempts to guarantee it. If the guarantee cannot be provided, one 
possibility is that the invoker of the task can attempt to guarantee an alternate version with different 
resource requirements, if one exists. This technique is called multiple version selection. 

Resource managers use the concept of delayed guarantees when resources are scheduled. If a resource
request cannot be granted, then the application is notified immediately so that it can take corrective
action. If the request can be scheduled, then the application is notified only when it should start
executing. This avoids sending an extra acknowledgment message over the network and simplifies the
scheduling agent.

Resource managers send exception notification messages to applications if a guarantee cannot be 
satisfied due to faults, or preemption of resources by higher priority tasks. Under these circumstances, if 
the resource manager cannot maintain the guarantee, it sends a message to the application notifying it 
of the resource exception. These messages enable applications to perform exception handling. 

3.2 Implementation 

When the resource manager starts executing, it initializes its data structures for scheduling, sets up
the UDP socket for inter-process communication, and sets up signal handlers to handle asynchronous I/O.
The resource manager then waits for resource requests to arrive and dispatches these jobs.

The format of the resource request message is shown in Figure 4. When a resource request arrives,
it is entered into a buffer for the dispatcher to handle at a later time. This design was used
to keep the resource manager from missing messages. The dispatcher attempts to schedule jobs in the
request buffer.
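
A request buffer of this kind is often built as a fixed-size ring. The sketch below assumes such a
design; the capacity, the reduced rs_pending record, and the function names are hypothetical, and the
report does not say how the prototype's buffer is organized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RS_BUF_CAP 8  /* assumed capacity; the prototype's size is not stated */

/* Reduced request record, enough to show the buffering. */
struct rs_pending {
    int    pid;
    double cpu_time;
};

struct rs_buffer {
    struct rs_pending slot[RS_BUF_CAP];
    size_t head;   /* index of the oldest entry */
    size_t count;  /* number of entries currently queued */
};

/* Called on message arrival: store the request and return quickly. */
static bool rs_buffer_put(struct rs_buffer *b, struct rs_pending r)
{
    assert(b != NULL);
    if (b->count == RS_BUF_CAP)
        return false;  /* buffer full */
    b->slot[(b->head + b->count) % RS_BUF_CAP] = r;
    b->count++;
    return true;
}

/* Called by the dispatcher: take the oldest request, if any. */
static bool rs_buffer_get(struct rs_buffer *b, struct rs_pending *out)
{
    assert(b != NULL && out != NULL);
    if (b->count == 0)
        return false;
    *out = b->slot[b->head];
    b->head = (b->head + 1) % RS_BUF_CAP;
    b->count--;
    return true;
}
```

If arrivals are stored from a signal handler, as asynchronous I/O suggests, the dispatcher would need to
block that signal while it removes an entry; the sketch leaves this out.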

The resource manager communicates with the scheduling agent using messages. If a job cannot be
scheduled, it is sent a REJECT message immediately; otherwise it will be sent a GOAHEAD message at the
appropriate start time. With delayed guarantees, the guarantee is implied for as long as the application
remains blocked on its resource request. This technique also provides a graceful way to preempt
applications before they have started, by simply sending a REJECT message to the application. The
message types sent from the resource manager to scheduling agents are listed in Table 3.

Once an application starts executing, it will execute until completion. The resource manager will 
sleep until either the currently executing procedure completes on its own or exceeds its deadline. In 
the latter case, the resource manager sends an ABORT message to the application thereby generating a 
resource exception. 


Table 3: R-Shell Message Protocol

Message    Description
---------  ---------------------------------------------------
GOAHEAD    Delayed guarantee. Procedure may start execution
REJECT     Resource request cannot be guaranteed
ABORT      Abort procedure level


4 The Translator 

4.1 The Approach 

In the prototype, the R-Shell translator is implemented as a finite state machine that parses a C program 
and performs the following actions: 

1. Reads the requirements file into memory (rfp.c).

2. Inserts #include "rshell.h" as the first line of code in the application program.

3. Begins parsing the application code. 

The translator parses C code by looking for function declarations. It ignores comments and string
constants, and keeps track of braces { } to determine the level of code nesting. Functions can only be
declared at level 0. The translator finds function names by looking for an alphanumeric string followed
by an opening parenthesis, and saves the parameter list of the function to be used later.

When procedure main() is detected, the function call initialize_RM() is inserted as the first line
of code in the procedure. This call initializes communication with the resource manager. When a
procedure declaration is detected, the requirements file is searched to see if that procedure has any
resource requirements. If a match is found, then code is generated to issue a resource request as an if
statement. The application code is indented and placed in a new level of braces.

Return statements are then located and converted to return_ statements so that procedure
completion notification code can be generated. return_ is a macro defined in rshell.h that notifies the
resource manager only after the return value is computed. The final closing brace of a function is also
located so that a notify_RM() procedure call can be inserted.
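
The ordering that the return macro must preserve (compute the return value first, notify the resource
manager second, return third) can be sketched as follows. The macro name RS_RETURN, its explicit type
parameter, and the counting stub that stands in for notify_RM() are assumptions for illustration; the
report does not show the definition in rshell.h.

```c
#include <assert.h>

/* Stand-in for the library call: counts notifications instead of sending one. */
static int rs_notifications;
static void notify_RM(void) { rs_notifications++; }

/* Hypothetical macro: evaluate the return expression, then notify, then return. */
#define RS_RETURN(type, expr)      \
    do {                           \
        type rs_value_ = (expr);   \
        notify_RM();               \
        return rs_value_;          \
    } while (0)

/* The expression reads rs_notifications, so the result shows that it was
 * evaluated before notify_RM() incremented the counter. */
static int rs_demo(void)
{
    RS_RETURN(int, rs_notifications + 10);
}
```

Evaluating the expression into a temporary matters when the expression is itself a long computation: the
completion notice should not be sent while the procedure is still consuming its guaranteed resources.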


4.2 Language Constructs 

This section describes the code that is inserted by the translator. Each C fragment implements one
language construct for the purpose named in its heading. This section also describes return values for
functions with scheduling agents.


4.2.1 Multiple Version Selection 

if (!resource_request(
        /* Resource Requirements */ )) {
    return alternate_version();
} else {
    /* Application code */
}


In order to implement multiple version selection, if statements are inserted into source code as blocks
around application code. If the resource request fails, then an alternate version is executed; otherwise
control flow continues to the user application.
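
To make the control flow concrete, the sketch below wraps two procedures in the pattern above and lets
the primary version fall back to its alternate. The resource_request() shown here is a stand-in with an
assumed one-argument signature and an assumed CPU budget; the function bodies are placeholders, and only
the names grid_resolve and half_matrix and their CPU figures come from the sample requirements file.

```c
#include <assert.h>

/* Stand-in for the R-Shell library call; the real signature is not given in
 * the report. Here a request succeeds if it fits in an assumed CPU budget. */
static double rs_cpu_budget = 80.0;

static int resource_request(double cpu_seconds)
{
    assert(cpu_seconds >= 0.0);
    return cpu_seconds <= rs_cpu_budget;
}

/* Alternate version: works on a reduced problem size. */
static int half_matrix(int m)
{
    if (!resource_request(74.0)) {
        return -1;                  /* no further alternate in this sketch */
    } else {
        return m;                   /* placeholder for the cheaper computation */
    }
}

/* Primary version, wrapped the way the translator wraps application code. */
static int grid_resolve(int m)
{
    if (!resource_request(100.0)) {
        return half_matrix(m / 2);  /* fall back to the alternate version */
    } else {
        return 2 * m;               /* placeholder for the full computation */
    }
}
```

With the assumed budget of 80, grid_resolve(10) cannot obtain its 100 units and runs half_matrix(5)
instead, which fits.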


4.2.2 Resource Exception Handling 

if (!resource_request(
        /* Resource Requirements */ )) {
    return request_exception();
} else {
    /* Application code */
}


Resource exception handling is similar to multiple version selection. If a resource request fails, then
an exception handler is called. There can be a chain of failed resource requests using multiple
version selection. The final version that is called is an exception handler. The default exception
handler, request_exception(), returns REQUEST_EXCEPTION to the caller without executing the procedure.
The programmer may specify their own exception handler in the requirements file. Exception handlers have
no stated resource requirements, thus they are guaranteed to execute.


4.2.3 Procedure Level Scheduling 

if (setjmp(save_environment()->environment)) {
    return proc_abort("main");
} else {
    /* Application code */
}
notify_RM();


Once a procedure obtains a resource guarantee, the scheduling agent must ensure that the resource 
consumption does not exceed the stated requirements. If any requirement, such as CPU time or memory 
used, is exceeded, then the procedure is aborted. 

In order to abort a procedure, the scheduling agent must save the environment of the application
just prior to executing the application code. The C library routines setjmp() and longjmp() are used
to achieve this. The initial call to setjmp() saves the current environment and returns 0. This makes
the if condition false, so the application code starts executing. A subsequent call to longjmp() with
the proper environment will cause control to return to the if statement and cause setjmp() to return
1, thus aborting the procedure.

The save_environment() procedure maintains a linked list of environments so that procedures may 
be aborted at any level. When a procedure finishes, it calls notify_RM() to restore the appropriate 
environment and to notify the resource manager that the procedure has completed. 
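
The following sketch shows how saved environments let an abort unwind to any active procedure level. It
is an illustration under stated assumptions, not the prototype's code: a fixed array stands in for the
linked list of environments, the names rs_env, rs_depth, rs_abort_to and rs_run are hypothetical, and the
return value 100 + level is used only so that the caller can see which level caught the abort.

```c
#include <assert.h>
#include <setjmp.h>

#define RS_MAX_LEVELS 16

/* Saved environments, one per active procedure level. */
static jmp_buf rs_env[RS_MAX_LEVELS];
static int rs_depth;

/* Abort back to the given active level by jumping to its saved environment. */
static void rs_abort_to(int level)
{
    assert(level >= 0 && level < rs_depth);
    longjmp(rs_env[level], 1);
}

/*
 * Run a chain of nested procedures, `remaining` levels deep. At the innermost
 * level, abort to `abort_level` (or complete normally if it is negative).
 * Returns 100 + level from the level that was aborted, otherwise 0.
 */
static int rs_run(int remaining, int abort_level)
{
    assert(rs_depth < RS_MAX_LEVELS);
    int level = rs_depth++;  /* not modified after setjmp() */
    int result = 0;

    if (setjmp(rs_env[level])) {
        rs_depth = level;    /* discard this and all deeper environments */
        return 100 + level;  /* reached only through longjmp() */
    }

    if (remaining > 0)
        result = rs_run(remaining - 1, abort_level);
    else if (abort_level >= 0)
        rs_abort_to(abort_level);

    rs_depth = level;        /* normal completion: pop this environment */
    return result;
}
```

A jump is only valid to a level whose procedure is still active, which is why the environment is popped
on every exit path.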

The default exception handler called when an application generates an exception at run-time is
proc_abort(). This exception handler simply prints an abort message and returns the value
RESOURCE_EXCEPTION to the caller. The programmer can specify their own exception handler in the
requirements file.

4.2.4 Use of asynchronous I/O to control applications 

Signal handlers are used to handle asynchronous I/O. Applications receive GOAHEAD, REJECT and ABORT
messages from the resource manager. See Table 3 for a description of message types. When an application
receives an ABORT message, it determines which procedure level to abort to, restores the environment
stack, and calls longjmp() to return to the appropriate procedure.

5 Final Remarks 

In most real-time systems, the OS and the application share the responsibility for resource management, 
with each having its own well-defined role in the resource management process. They act as separate 
units, rather than co-operating to exploit the knowledge of each or jointly implementing the desired 
functionality. The R-Shell approach is instead based on the concept of scheduling agents. The scheduling
agent implementation can be customized to the particular OS and system configuration, thus exploiting 
OS-level knowledge. 

From our experience with the prototype R-Shell system, we conclude that this approach is useful in
building flexible, fully distributed, object-oriented real-time applications.